Assignment 1¶
Part 1¶
a) In your opinion, what were the most important turning points in the history of deep learning?¶
The most important turning points for deep learning would, in my opinion, be the increase in computational power from the use of GPUs, the adoption of backpropagation, and, in recent years, the use of transformers.
The development of faster hardware meant that training could be done at a much faster rate and that larger datasets could be utilized. AlexNet was one of the cornerstones of the GPU boom, training a convolutional neural network on two GTX 580 GPUs.
An earlier turning point, in the 1980s, was the development and popularization of backpropagation, which allowed networks to learn from their prediction errors and adjust the weights accordingly.
Lastly, in recent years, the advancement of transformer-based architectures has given a new boost to deep learning, especially leading to the development of generative pre-trained transformers (GPTs), as seen in models like "ChatGPT".
b) Explain the ADAM optimizer.¶
The Adaptive Moment Estimation (ADAM) optimizer is used to update the weights in a network by combining the ideas of momentum and adaptive learning rates. It works by maintaining two moving averages: one for the gradient (momentum term) and another for the squared gradient (RMSprop term). These averages are bias-corrected, and weights are updated using both terms to adapt the learning rate for each parameter.
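As an illustration, here is a minimal sketch of the textbook Adam update rule for a single parameter tensor (not the internals of torch.optim.Adam, just the idea described above):
import numpy as np
def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # Update the biased first-moment (momentum) and second-moment (RMSprop) estimates
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad**2
    # Bias-correct both estimates (t is the 1-indexed step number)
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    # Per-parameter adaptive update
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v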
c) Assume data input is a single 30x40 pixel image. First layer is a convolutional layer with 5 filters, with kernel size 3x2, step size (1,1) and padding='valid'. What are the output dimensions?¶
The output dimensions are given by $$\frac{W-K+2P}{S}+1$$ where $W$ is the input size along the given dimension, $K$ is the kernel size, $S$ the stride, and $P$ the amount of padding. So we have the following: $$\text{Output Height} = \frac{W-K+2P}{S}+1 = \frac{30-3+2\cdot0}{1}+1=\frac{27}{1}+1=28$$ $$\text{Output Width} = \frac{W-K+2P}{S}+1 = \frac{40-2+2\cdot0}{1}+1=\frac{38}{1}+1=39$$ and since the convolutional layer has 5 filters, the output will have 5 channels. Hence, the output dimensions will be 28x39x5.
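The formula can be checked with a small torch sketch (assuming the single image has one channel, i.e. is grayscale):
import torch
from torch import nn
# Dummy batch with one 1-channel 30x40 image: (batch, channels, height, width)
x = torch.randn(1, 1, 30, 40)
# 5 filters, kernel size 3x2, stride (1,1), padding='valid' (i.e. 0)
conv = nn.Conv2d(in_channels=1, out_channels=5, kernel_size=(3, 2), stride=(1, 1), padding=0)
print(conv(x).shape)  # torch.Size([1, 5, 28, 39])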
d) Assuming ReLU activations and offsets, and that the last layer is softmax, how many parameters does this network have:¶
The number of parameters is found from the number of layers and the number of neurons in each layer. We have 1 input layer, 3 hidden layers, and 1 output layer. Since it is a fully connected network, each of the 5 input neurons connects to every neuron in the first hidden layer, and so on. Hence, we get the following weights
$$\text{Weights} = 5\times5+5\times5+5\times5+5\times3 = 90$$
We also need to count the bias for each neuron after the input layer.
$$\text{Biases} = 5+5+5+3 = 18$$
In total we have the following number of parameters:
$$\text{Total} = 90+18 = 108$$
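As a sanity check, the same count can be reproduced in torch, assuming the network in the figure has 5 inputs, three hidden layers of 5 neurons each, and a 3-neuron softmax output:
import torch
from torch import nn
# Assumed 5-5-5-5-3 fully connected network with ReLU activations and a softmax output
net = nn.Sequential(
    nn.Linear(5, 5), nn.ReLU(),
    nn.Linear(5, 5), nn.ReLU(),
    nn.Linear(5, 5), nn.ReLU(),
    nn.Linear(5, 3), nn.Softmax(dim=1),
)
print(sum(p.numel() for p in net.parameters()))  # 108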
e) For a given minibatch, the targets are [1,4, 5, 8] and the network output is [0.1,4.4,0.2,10]. If the loss function is "torch.nn.HuberLoss(reduction='mean', delta=1.0)", what is the loss for this minibatch?¶
The Huber loss is defined by the piecewise function $$L_\delta(y-\hat{y}) = \left\{ \begin{array}{ll} \frac{1}{2}(y-\hat{y})^2 & |y-\hat{y}|\le\delta\\ \delta (|y-\hat{y}|-\frac{1}{2}\delta) & \text{otherwise,} \\ \end{array} \right.$$ where $y$ is the target, $\hat{y}$ is the predicted output, and $\delta$ is the threshold. With the targets $y=[1,4,5,8]$ and predicted outputs $\hat{y}=[0.1,4.4,0.2,10]$, we can compute the loss for each element. First we determine which case applies: $$\begin{aligned} |y-\hat{y}|=|1-0.1|=0.9 \\ |y-\hat{y}|=|4-4.4|=0.4 \\ |y-\hat{y}|=|5-0.2|=4.8 \\ |y-\hat{y}|=|8-10|=2.0 \end{aligned}$$ then the individual losses $$\begin{aligned} L(1,0.1)=\frac{1}{2}(1-0.1)^2=0.405 \\ L(4,4.4)=\frac{1}{2}(4-4.4)^2=0.08 \\ L(5,0.2)=1\cdot(|5-0.2|-\frac{1}{2}\cdot1) = 4.3 \\ L(8,10)=1\cdot(|8-10|-\frac{1}{2}\cdot1) = 1.5 \end{aligned}$$ In total, the mean loss is $$\frac{0.405+0.08+4.3+1.5}{4}=1.57$$ This can also be computed with torch using the following code:
import torch
# Targets and outputs
targets = torch.tensor([1, 4, 5, 8])
outputs = torch.tensor([0.1, 4.4, 0.2, 10])
# Define the Huber loss with delta = 1.0
huber_loss = torch.nn.HuberLoss(reduction='mean', delta=1.0)
# Calculate the loss
loss = huber_loss(outputs, targets)
print(loss.item())
1.571250081062317
Part 2¶
import os
import pandas as pd
from torchvision.io import read_image
import numpy as np
import torch
from torch import nn
from torch.utils.data import DataLoader
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import Dataset
from torchvision import datasets
from torchvision.transforms import ToTensor
import matplotlib.pyplot as plt
from torchsummary import summary
import torch_directml
from PIL import Image
Dataset builder¶
class InsectDataset(Dataset):
def __init__(self, annotations_file, img_dir, root_dir, transform=None):
""""
directory setup of the images and labels
root_dir: Main data directory
annotations_file: csv file
img_dir: image directory
"""
self.root_dir = root_dir
annotations_path = os.path.join(self.root_dir, annotations_file)
self.img_labels = pd.read_csv(annotations_path)
self.img_dir = img_dir
self.transform = transform
def __len__(self):
return len(self.img_labels)
def __getitem__(self, idx):
"""
Retrieve image via filename
Open image
Retrieve label
transform if needed
return image and label
"""
img_path = os.path.join(self.root_dir,self.img_dir, self.img_labels.iloc[idx, 2])
# print(f"Loading image from: {img_path}") # Debug print
image = Image.open(img_path)
label = self.img_labels.iloc[idx, 1] # Retrieve label
# print(f"----------Labelled = {label}\n") # Debug print
if self.transform:
image = self.transform(image)
return image, label
transform = transforms.Compose([
transforms.Resize((520,520)),
transforms.ToTensor()])
batch_size = 4
# Set up the dataset.
dataset = InsectDataset(annotations_file='insects.csv', img_dir='Insects',root_dir='Data/', transform=transform)
# Set up the dataloader.
trainloader = torch.utils.data.DataLoader(dataset,
batch_size=batch_size,
shuffle=True,
num_workers=0)
# get some images
dataiter = iter(trainloader)
images, labels = next(dataiter)
for i in range(5): #Run through 5 batches
images, labels = next(dataiter)
for image, label in zip(images,labels): # Run through all samples in a batch
plt.figure()
plt.imshow(np.transpose(image.numpy(), (1, 2, 0)))
plt.title(label)
Part 3¶
a)¶
The given data is already described, so I go straight ahead and extract the features and labels from both the trainData and testData ".txt" files. This is easily done using pandas.
import pandas as pd
# Load the data from .txt files
train_data = pd.read_csv('Data/Part3/trainData.txt', sep=r'\s+', header=None)
test_data = pd.read_csv('Data/Part3/testData.txt', sep=r'\s+', header=None)
# The first column is the label, and the second and third columns are the features
train_labels = train_data.iloc[:, 0] # Labels (first column)
train_features = train_data.iloc[:, 1:] # Features (second and third columns)
test_labels = test_data.iloc[:, 0] # Labels (first column)
test_features = test_data.iloc[:, 1:] # Features (second and third columns)
# Print the first few rows to verify the data
print(train_data.head())
# Define a function to visualize the data
def plot_data(features, labels, title):
plt.figure(figsize=(8, 6))
unique_labels = np.unique(labels) # Get unique class labels
colors = plt.cm.viridis(np.linspace(0, 1, len(unique_labels))) # Generate colors
# Plot each class with its own color and label
for label, color in zip(unique_labels, colors):
idx = labels.to_numpy() == label # Get indices of the current label
plt.scatter(features.iloc[idx, 0], features.iloc[idx, 1], label=f'Class {label}', color=color, edgecolor='k')
plt.title(title)
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.legend(title="Class Labels",loc="upper right")
plt.grid(True)
plt.show()
# Plot the training data
plot_data(train_features, train_labels, "Training Data Visualization")
# Plot the test data
plot_data(test_features, test_labels, "Test Data Visualization")
     0         1         2
0  2.0  0.243584  0.539536
1  0.0  0.029800  0.074531
2  4.0 -0.437585 -0.383632
3  2.0 -0.224602  0.407026
4  3.0  0.284853  0.800316
The plots show spiral arms, like a galaxy, and we can easily see the data points grouped by their corresponding class labels.
b)¶
- Describe your network
The network needs to take the two input features and expand them to more parameters before shrinking down to the 5 classes. Therefore, a number of linear layers is used, starting with an input layer taking 2 input features and outputting 256, then hidden layers going from 256 to 128 and from 128 to 64, and finally an output layer going from 64 to the 5 classes. Each hidden layer is followed by a ReLU activation function.
- Describe your training strategy
The network is trained with the Adam optimizer and the Cross Entropy loss function, with appropriate values for the learning rate and batch size. This strategy was built on our prior code from weeks 3 and 4, where we saw how certain optimizers and loss functions behave and that the Adam optimizer and Cross Entropy loss are often the preferred choice due to their efficiency.
Training is performed in batches of 32 with a learning rate of 0.001, running for 100 epochs or until 10 consecutive epochs pass with no improvement in the test loss.
class NeuralNet(nn.Module):
def __init__(self, input_size, number_classes):
super().__init__()
# Linear Layers
self.linear1 = nn.Linear(input_size, 256)
self.linear2 = nn.Linear(256, 128)
self.linear3 = nn.Linear(128, 64)
self.linear4 = nn.Linear(64, number_classes)
# Activation Function
self.relu = nn.ReLU()
def forward(self, x):
x = self.linear1(x)
x = self.relu(x)
x = self.linear2(x)
x = self.relu(x)
x = self.linear3(x)
x = self.relu(x)
x = self.linear4(x)
return x
# Hyperparameters
input_size = 2
number_classes = 5
batch_size = 32
lr = 0.001
# Device
device = "cpu"
# Load the training/test data and labels into tensors
training_data = torch.tensor(train_features.values, dtype=torch.float32)
training_labels = torch.tensor(train_labels.values, dtype=torch.long)
testing_data = torch.tensor(test_features.values, dtype=torch.float32)
testing_labels = torch.tensor(test_labels.values, dtype=torch.long)
# Combine to a dataset
train_dataset = torch.utils.data.TensorDataset(training_data, training_labels)
test_dataset = torch.utils.data.TensorDataset(testing_data, testing_labels)
# Load DataLoader
train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True,pin_memory=True)
test_dataloader = DataLoader(test_dataset, batch_size=batch_size, shuffle=True,pin_memory=True)
# Neural Network
model = NeuralNet(input_size, number_classes).to(device)
# Optimizer and Loss
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr)
class trainNN:
def __init__(self, model, train_dataloader, test_dataloader, loss_fn, optimizer, num_epochs, patience):
"""
Inputs
"""
self.model = model
self.train_dataloader = train_dataloader
self.test_dataloader = test_dataloader
self.loss_fn = loss_fn
self.optimizer = optimizer
self.num_epochs = num_epochs
self.patience = patience
self.reset()
def reset(self):
"""
        Reset function. Re-initializes the tracked metrics, the model weights, and the optimizer.
"""
self.train_losses = []
self.test_losses = []
self.train_accuracies = []
self.test_accuracies = []
self.number_epochs = 1
def reset_weights(m):
if hasattr(m, 'reset_parameters'):
m.reset_parameters()
self.model.apply(reset_weights)
self.optimizer = torch.optim.Adam(self.model.parameters(), lr=lr)
def _train(self, dataloader, model, loss_fn, optimizer):
"""
Training function
"""
size = len(dataloader.dataset)
model.train()
total_loss, correct = 0, 0 # Track the total loss for the epoch
for batch, (X, y) in enumerate(dataloader):
X, y = X.to(device), y.to(device)
# Compute prediction error
pred = model(X)
loss = loss_fn(pred, y)
total_loss += loss.item() # Accumulate loss
# Backpropagation
model.zero_grad()
loss.backward()
optimizer.step()
# Calculate accuracy
correct += (pred.argmax(1) == y).type(torch.float).sum().item()
# Print loss for the first batch and then every 5th batch
if batch == 0 or (batch + 1) % 5 == 0 or (batch + 1) == len(dataloader):
current = (batch + 1) * len(X) if (batch + 1) < len(dataloader) else size
print(f"loss: {loss.item():>7f} [{current:>5d}/{size:>5d}]")
avg_loss = total_loss / len(dataloader) # Calculate average loss for the epoch
avg_accuracy = correct / size
self.train_losses.append(avg_loss) # Store the average loss for the epoch
self.train_accuracies.append(avg_accuracy) # Store the average accuracy for the epoch
print(f"Train loss: {avg_loss:>7f}, Accuracy: {(100*avg_accuracy):>0.1f}%")
def _test(self, dataloader, model, loss_fn):
"""
Test function
"""
size = len(dataloader.dataset)
num_batches = len(dataloader)
model.eval()
test_loss, correct = 0, 0
with torch.no_grad():
for X, y in dataloader:
X, y = X.to(device), y.to(device)
pred = model(X)
test_loss += loss_fn(pred, y).item()
correct += (pred.argmax(1) == y).type(torch.float).sum().item()
test_loss /= num_batches
correct /= size
self.test_losses.append(test_loss) # Store the test loss for the epoch
self.test_accuracies.append(correct)
print(f"Test Error: \n Avg loss: {test_loss:>8f}, Accuracy: {(100*correct):>0.1f}% \n")
return round(test_loss,3)
    def _plot_results(self):
        """
        Plot the loss and accuracy curves over the trained epochs.
        """
        epochs = range(1, len(self.train_losses) + 1)
        plt.subplot(2, 1, 1)
        plt.plot(epochs, self.train_losses, label="Train Loss")
        plt.plot(epochs, self.test_losses, label="Test Loss")
        plt.xlabel("Epochs")
        plt.ylabel("Loss")
        plt.title("Training and Testing Loss over Epochs")
        plt.legend()
        plt.subplot(2, 1, 2)
        plt.plot(epochs, self.train_accuracies, label="Train Accuracy")
        plt.plot(epochs, self.test_accuracies, label="Test Accuracy")
        plt.xlabel("Epochs")
        plt.ylabel("Accuracy")
        plt.title("Training and Testing Accuracy over Epochs")
        plt.legend()
        plt.tight_layout()
        plt.show()
def TrainModel(self):
"""
Execute training function
"""
# Early stopping parameters
best_test_loss = float('inf')
epochs_without_improvement = 0
for t in range(self.num_epochs):
print(f"Epoch {t+1}\n-------------------------------")
# Print number of batches only for the first epoch
if t == 0:
num_batches = len(self.train_dataloader)
print(f"Number of batches: {num_batches}")
            self._train(self.train_dataloader, self.model, self.loss_fn, self.optimizer)
            current_test_loss = self._test(self.test_dataloader, self.model, self.loss_fn)
# Early stopping logic
if current_test_loss < best_test_loss:
best_test_loss = current_test_loss
epochs_without_improvement = 0
else:
epochs_without_improvement += 1
if epochs_without_improvement >= self.patience:
print(f"Early stopping triggered after {self.number_epochs+1} epochs.")
break
print(f"Epochs without improvement: {epochs_without_improvement+1}")
self.number_epochs += 1
print("Training completed.")
self._plot_results()
train = trainNN(model, train_dataloader, test_dataloader, loss_fn, optimizer, 100, 10)
train.reset()
train.TrainModel()
Epoch 1 ------------------------------- Number of batches: 38 loss: 1.634999 [ 32/ 1200] loss: 1.589952 [ 160/ 1200] loss: 1.563196 [ 320/ 1200] loss: 1.512319 [ 480/ 1200] loss: 1.495391 [ 640/ 1200] loss: 1.545000 [ 800/ 1200] loss: 1.486723 [ 960/ 1200] loss: 1.521180 [ 1120/ 1200] loss: 1.330905 [ 1200/ 1200] Train loss: 1.522027, Accuracy: 27.5% Test Error: Avg loss: 1.399943, Accuracy: 29.4% Epochs without improvement: 1 Epoch 2 ------------------------------- loss: 1.410996 [ 32/ 1200] loss: 1.400629 [ 160/ 1200] loss: 1.451725 [ 320/ 1200] loss: 1.394114 [ 480/ 1200] loss: 1.368460 [ 640/ 1200] loss: 1.388224 [ 800/ 1200] loss: 1.392168 [ 960/ 1200] loss: 1.234187 [ 1120/ 1200] loss: 1.165876 [ 1200/ 1200] Train loss: 1.354958, Accuracy: 32.8% Test Error: Avg loss: 1.252620, Accuracy: 33.4% Epochs without improvement: 1 Epoch 3 ------------------------------- loss: 1.365736 [ 32/ 1200] loss: 1.259524 [ 160/ 1200] loss: 1.075631 [ 320/ 1200] loss: 1.016544 [ 480/ 1200] loss: 1.094023 [ 640/ 1200] loss: 0.964274 [ 800/ 1200] loss: 1.175943 [ 960/ 1200] loss: 1.013033 [ 1120/ 1200] loss: 0.828123 [ 1200/ 1200] Train loss: 1.169105, Accuracy: 43.5% Test Error: Avg loss: 1.047375, Accuracy: 53.5% Epochs without improvement: 1 Epoch 4 ------------------------------- loss: 1.039672 [ 32/ 1200] loss: 1.104955 [ 160/ 1200] loss: 0.899412 [ 320/ 1200] loss: 1.009252 [ 480/ 1200] loss: 0.842854 [ 640/ 1200] loss: 0.933371 [ 800/ 1200] loss: 0.750719 [ 960/ 1200] loss: 0.758873 [ 1120/ 1200] loss: 0.516639 [ 1200/ 1200] Train loss: 0.928486, Accuracy: 63.4% Test Error: Avg loss: 0.783432, Accuracy: 74.2% Epochs without improvement: 1 Epoch 5 ------------------------------- loss: 0.836481 [ 32/ 1200] loss: 0.895845 [ 160/ 1200] loss: 0.697209 [ 320/ 1200] loss: 0.581662 [ 480/ 1200] loss: 0.902795 [ 640/ 1200] loss: 0.533425 [ 800/ 1200] loss: 0.512100 [ 960/ 1200] loss: 0.578260 [ 1120/ 1200] loss: 0.587026 [ 1200/ 1200] Train loss: 0.683279, Accuracy: 76.0% Test Error: Avg loss: 0.590661, Accuracy: 75.6% Epochs without improvement: 1 Epoch 6 ------------------------------- loss: 0.522035 [ 32/ 1200] loss: 0.386084 [ 160/ 1200] loss: 0.731761 [ 320/ 1200] loss: 0.636032 [ 480/ 1200] loss: 0.502901 [ 640/ 1200] loss: 0.561835 [ 800/ 1200] loss: 0.525176 [ 960/ 1200] loss: 0.452210 [ 1120/ 1200] loss: 0.734804 [ 1200/ 1200] Train loss: 0.526214, Accuracy: 80.8% Test Error: Avg loss: 0.441090, Accuracy: 83.3% Epochs without improvement: 1 Epoch 7 ------------------------------- loss: 0.378420 [ 32/ 1200] loss: 0.363965 [ 160/ 1200] loss: 0.349269 [ 320/ 1200] loss: 0.477257 [ 480/ 1200] loss: 0.335784 [ 640/ 1200] loss: 0.294459 [ 800/ 1200] loss: 0.319002 [ 960/ 1200] loss: 0.327112 [ 1120/ 1200] loss: 0.288813 [ 1200/ 1200] Train loss: 0.415695, Accuracy: 84.5% Test Error: Avg loss: 0.349538, Accuracy: 86.6% Epochs without improvement: 1 Epoch 8 ------------------------------- loss: 0.526820 [ 32/ 1200] loss: 0.372484 [ 160/ 1200] loss: 0.351093 [ 320/ 1200] loss: 0.186322 [ 480/ 1200] loss: 0.512933 [ 640/ 1200] loss: 0.223222 [ 800/ 1200] loss: 0.257140 [ 960/ 1200] loss: 0.211226 [ 1120/ 1200] loss: 0.355030 [ 1200/ 1200] Train loss: 0.328076, Accuracy: 86.8% Test Error: Avg loss: 0.303947, Accuracy: 89.6% Epochs without improvement: 1 Epoch 9 ------------------------------- loss: 0.402547 [ 32/ 1200] loss: 0.373510 [ 160/ 1200] loss: 0.375409 [ 320/ 1200] loss: 0.206043 [ 480/ 1200] loss: 0.308058 [ 640/ 1200] loss: 0.243929 [ 800/ 1200] loss: 0.318912 [ 960/ 1200] loss: 0.242051 [ 1120/ 
1200] loss: 0.350981 [ 1200/ 1200] Train loss: 0.282649, Accuracy: 90.1% Test Error: Avg loss: 0.280351, Accuracy: 90.3% Epochs without improvement: 1 Epoch 10 ------------------------------- loss: 0.213580 [ 32/ 1200] loss: 0.278102 [ 160/ 1200] loss: 0.231072 [ 320/ 1200] loss: 0.220518 [ 480/ 1200] loss: 0.290559 [ 640/ 1200] loss: 0.205397 [ 800/ 1200] loss: 0.252911 [ 960/ 1200] loss: 0.157955 [ 1120/ 1200] loss: 0.142582 [ 1200/ 1200] Train loss: 0.234896, Accuracy: 91.1% Test Error: Avg loss: 0.226537, Accuracy: 91.3% Epochs without improvement: 1 Epoch 11 ------------------------------- loss: 0.167467 [ 32/ 1200] loss: 0.144003 [ 160/ 1200] loss: 0.162012 [ 320/ 1200] loss: 0.224904 [ 480/ 1200] loss: 0.204073 [ 640/ 1200] loss: 0.219121 [ 800/ 1200] loss: 0.173227 [ 960/ 1200] loss: 0.196963 [ 1120/ 1200] loss: 0.141362 [ 1200/ 1200] Train loss: 0.200145, Accuracy: 93.8% Test Error: Avg loss: 0.247672, Accuracy: 92.0% Epochs without improvement: 2 Epoch 12 ------------------------------- loss: 0.149835 [ 32/ 1200] loss: 0.147628 [ 160/ 1200] loss: 0.212771 [ 320/ 1200] loss: 0.062943 [ 480/ 1200] loss: 0.148714 [ 640/ 1200] loss: 0.183984 [ 800/ 1200] loss: 0.084565 [ 960/ 1200] loss: 0.278355 [ 1120/ 1200] loss: 0.162121 [ 1200/ 1200] Train loss: 0.178507, Accuracy: 94.2% Test Error: Avg loss: 0.178835, Accuracy: 92.3% Epochs without improvement: 1 Epoch 13 ------------------------------- loss: 0.224077 [ 32/ 1200] loss: 0.164845 [ 160/ 1200] loss: 0.182724 [ 320/ 1200] loss: 0.099063 [ 480/ 1200] loss: 0.108574 [ 640/ 1200] loss: 0.171978 [ 800/ 1200] loss: 0.329342 [ 960/ 1200] loss: 0.086470 [ 1120/ 1200] loss: 0.077369 [ 1200/ 1200] Train loss: 0.153793, Accuracy: 95.0% Test Error: Avg loss: 0.150579, Accuracy: 94.3% Epochs without improvement: 1 Epoch 14 ------------------------------- loss: 0.193030 [ 32/ 1200] loss: 0.220585 [ 160/ 1200] loss: 0.189236 [ 320/ 1200] loss: 0.164166 [ 480/ 1200] loss: 0.091198 [ 640/ 1200] loss: 0.114371 [ 800/ 1200] loss: 0.159097 [ 960/ 1200] loss: 0.055403 [ 1120/ 1200] loss: 0.168787 [ 1200/ 1200] Train loss: 0.134924, Accuracy: 96.1% Test Error: Avg loss: 0.163710, Accuracy: 95.0% Epochs without improvement: 2 Epoch 15 ------------------------------- loss: 0.121402 [ 32/ 1200] loss: 0.106157 [ 160/ 1200] loss: 0.118167 [ 320/ 1200] loss: 0.085395 [ 480/ 1200] loss: 0.085999 [ 640/ 1200] loss: 0.038718 [ 800/ 1200] loss: 0.073532 [ 960/ 1200] loss: 0.143075 [ 1120/ 1200] loss: 0.244079 [ 1200/ 1200] Train loss: 0.127650, Accuracy: 95.8% Test Error: Avg loss: 0.139334, Accuracy: 95.3% Epochs without improvement: 1 Epoch 16 ------------------------------- loss: 0.148170 [ 32/ 1200] loss: 0.085191 [ 160/ 1200] loss: 0.102690 [ 320/ 1200] loss: 0.214958 [ 480/ 1200] loss: 0.078295 [ 640/ 1200] loss: 0.016274 [ 800/ 1200] loss: 0.060520 [ 960/ 1200] loss: 0.131179 [ 1120/ 1200] loss: 0.043854 [ 1200/ 1200] Train loss: 0.117211, Accuracy: 96.6% Test Error: Avg loss: 0.132165, Accuracy: 95.3% Epochs without improvement: 1 Epoch 17 ------------------------------- loss: 0.031994 [ 32/ 1200] loss: 0.116437 [ 160/ 1200] loss: 0.082504 [ 320/ 1200] loss: 0.161399 [ 480/ 1200] loss: 0.072663 [ 640/ 1200] loss: 0.196777 [ 800/ 1200] loss: 0.048010 [ 960/ 1200] loss: 0.171398 [ 1120/ 1200] loss: 0.030964 [ 1200/ 1200] Train loss: 0.110198, Accuracy: 96.3% Test Error: Avg loss: 0.117810, Accuracy: 96.3% Epochs without improvement: 1 Epoch 18 ------------------------------- loss: 0.144806 [ 32/ 1200] loss: 0.060612 [ 160/ 1200] loss: 0.152887 [ 320/ 1200] 
loss: 0.077791 [ 480/ 1200] loss: 0.056999 [ 640/ 1200] loss: 0.148802 [ 800/ 1200] loss: 0.059945 [ 960/ 1200] loss: 0.063936 [ 1120/ 1200] loss: 0.067174 [ 1200/ 1200] Train loss: 0.092211, Accuracy: 97.2% Test Error: Avg loss: 0.108297, Accuracy: 96.0% Epochs without improvement: 1 Epoch 19 ------------------------------- loss: 0.125071 [ 32/ 1200] loss: 0.131941 [ 160/ 1200] loss: 0.046590 [ 320/ 1200] loss: 0.105941 [ 480/ 1200] loss: 0.071034 [ 640/ 1200] loss: 0.141952 [ 800/ 1200] loss: 0.034680 [ 960/ 1200] loss: 0.059010 [ 1120/ 1200] loss: 0.164604 [ 1200/ 1200] Train loss: 0.087451, Accuracy: 97.4% Test Error: Avg loss: 0.099505, Accuracy: 96.3% Epochs without improvement: 1 Epoch 20 ------------------------------- loss: 0.109016 [ 32/ 1200] loss: 0.180641 [ 160/ 1200] loss: 0.033990 [ 320/ 1200] loss: 0.093682 [ 480/ 1200] loss: 0.026118 [ 640/ 1200] loss: 0.024688 [ 800/ 1200] loss: 0.156485 [ 960/ 1200] loss: 0.082114 [ 1120/ 1200] loss: 0.081675 [ 1200/ 1200] Train loss: 0.080083, Accuracy: 97.3% Test Error: Avg loss: 0.117504, Accuracy: 95.3% Epochs without improvement: 2 Epoch 21 ------------------------------- loss: 0.075265 [ 32/ 1200] loss: 0.075027 [ 160/ 1200] loss: 0.015881 [ 320/ 1200] loss: 0.256996 [ 480/ 1200] loss: 0.165373 [ 640/ 1200] loss: 0.136501 [ 800/ 1200] loss: 0.103423 [ 960/ 1200] loss: 0.053603 [ 1120/ 1200] loss: 0.014133 [ 1200/ 1200] Train loss: 0.084095, Accuracy: 97.2% Test Error: Avg loss: 0.087348, Accuracy: 97.3% Epochs without improvement: 1 Epoch 22 ------------------------------- loss: 0.014095 [ 32/ 1200] loss: 0.128571 [ 160/ 1200] loss: 0.075538 [ 320/ 1200] loss: 0.040471 [ 480/ 1200] loss: 0.103803 [ 640/ 1200] loss: 0.116821 [ 800/ 1200] loss: 0.159726 [ 960/ 1200] loss: 0.136538 [ 1120/ 1200] loss: 0.080300 [ 1200/ 1200] Train loss: 0.080746, Accuracy: 97.3% Test Error: Avg loss: 0.086320, Accuracy: 97.7% Epochs without improvement: 1 Epoch 23 ------------------------------- loss: 0.062882 [ 32/ 1200] loss: 0.142306 [ 160/ 1200] loss: 0.111712 [ 320/ 1200] loss: 0.160835 [ 480/ 1200] loss: 0.041645 [ 640/ 1200] loss: 0.044973 [ 800/ 1200] loss: 0.055384 [ 960/ 1200] loss: 0.032326 [ 1120/ 1200] loss: 0.055863 [ 1200/ 1200] Train loss: 0.066261, Accuracy: 98.0% Test Error: Avg loss: 0.091306, Accuracy: 97.3% Epochs without improvement: 2 Epoch 24 ------------------------------- loss: 0.088212 [ 32/ 1200] loss: 0.037720 [ 160/ 1200] loss: 0.025180 [ 320/ 1200] loss: 0.098655 [ 480/ 1200] loss: 0.086353 [ 640/ 1200] loss: 0.057978 [ 800/ 1200] loss: 0.076532 [ 960/ 1200] loss: 0.066723 [ 1120/ 1200] loss: 0.058987 [ 1200/ 1200] Train loss: 0.063312, Accuracy: 98.4% Test Error: Avg loss: 0.115004, Accuracy: 96.0% Epochs without improvement: 3 Epoch 25 ------------------------------- loss: 0.100523 [ 32/ 1200] loss: 0.074072 [ 160/ 1200] loss: 0.182006 [ 320/ 1200] loss: 0.081973 [ 480/ 1200] loss: 0.067353 [ 640/ 1200] loss: 0.183379 [ 800/ 1200] loss: 0.090170 [ 960/ 1200] loss: 0.158497 [ 1120/ 1200] loss: 0.129507 [ 1200/ 1200] Train loss: 0.083950, Accuracy: 97.2% Test Error: Avg loss: 0.084666, Accuracy: 97.0% Epochs without improvement: 1 Epoch 26 ------------------------------- loss: 0.046266 [ 32/ 1200] loss: 0.040291 [ 160/ 1200] loss: 0.089801 [ 320/ 1200] loss: 0.145453 [ 480/ 1200] loss: 0.034128 [ 640/ 1200] loss: 0.156608 [ 800/ 1200] loss: 0.047621 [ 960/ 1200] loss: 0.106920 [ 1120/ 1200] loss: 0.059001 [ 1200/ 1200] Train loss: 0.071368, Accuracy: 98.0% Test Error: Avg loss: 0.080202, Accuracy: 97.7% Epochs without 
improvement: 1 Epoch 27 ------------------------------- loss: 0.082863 [ 32/ 1200] loss: 0.010990 [ 160/ 1200] loss: 0.158283 [ 320/ 1200] loss: 0.162913 [ 480/ 1200] loss: 0.051533 [ 640/ 1200] loss: 0.121070 [ 800/ 1200] loss: 0.015259 [ 960/ 1200] loss: 0.046578 [ 1120/ 1200] loss: 0.015967 [ 1200/ 1200] Train loss: 0.058663, Accuracy: 98.2% Test Error: Avg loss: 0.079304, Accuracy: 97.7% Epochs without improvement: 1 Epoch 28 ------------------------------- loss: 0.049371 [ 32/ 1200] loss: 0.051845 [ 160/ 1200] loss: 0.052190 [ 320/ 1200] loss: 0.037314 [ 480/ 1200] loss: 0.022985 [ 640/ 1200] loss: 0.033883 [ 800/ 1200] loss: 0.024482 [ 960/ 1200] loss: 0.017938 [ 1120/ 1200] loss: 0.019321 [ 1200/ 1200] Train loss: 0.057352, Accuracy: 98.5% Test Error: Avg loss: 0.079725, Accuracy: 97.3% Epochs without improvement: 2 Epoch 29 ------------------------------- loss: 0.032844 [ 32/ 1200] loss: 0.008860 [ 160/ 1200] loss: 0.087181 [ 320/ 1200] loss: 0.037970 [ 480/ 1200] loss: 0.058756 [ 640/ 1200] loss: 0.005510 [ 800/ 1200] loss: 0.115826 [ 960/ 1200] loss: 0.015242 [ 1120/ 1200] loss: 0.023551 [ 1200/ 1200] Train loss: 0.049111, Accuracy: 98.8% Test Error: Avg loss: 0.090636, Accuracy: 97.0% Epochs without improvement: 3 Epoch 30 ------------------------------- loss: 0.082260 [ 32/ 1200] loss: 0.053773 [ 160/ 1200] loss: 0.076040 [ 320/ 1200] loss: 0.016323 [ 480/ 1200] loss: 0.101945 [ 640/ 1200] loss: 0.008519 [ 800/ 1200] loss: 0.015544 [ 960/ 1200] loss: 0.065944 [ 1120/ 1200] loss: 0.005586 [ 1200/ 1200] Train loss: 0.056390, Accuracy: 98.1% Test Error: Avg loss: 0.073030, Accuracy: 98.0% Epochs without improvement: 1 Epoch 31 ------------------------------- loss: 0.009967 [ 32/ 1200] loss: 0.017490 [ 160/ 1200] loss: 0.086233 [ 320/ 1200] loss: 0.008768 [ 480/ 1200] loss: 0.071747 [ 640/ 1200] loss: 0.042637 [ 800/ 1200] loss: 0.015538 [ 960/ 1200] loss: 0.004980 [ 1120/ 1200] loss: 0.054900 [ 1200/ 1200] Train loss: 0.046066, Accuracy: 98.8% Test Error: Avg loss: 0.109422, Accuracy: 95.7% Epochs without improvement: 2 Epoch 32 ------------------------------- loss: 0.108910 [ 32/ 1200] loss: 0.031064 [ 160/ 1200] loss: 0.010094 [ 320/ 1200] loss: 0.036454 [ 480/ 1200] loss: 0.048894 [ 640/ 1200] loss: 0.106291 [ 800/ 1200] loss: 0.007396 [ 960/ 1200] loss: 0.012896 [ 1120/ 1200] loss: 0.123573 [ 1200/ 1200] Train loss: 0.067408, Accuracy: 97.5% Test Error: Avg loss: 0.070819, Accuracy: 97.3% Epochs without improvement: 1 Epoch 33 ------------------------------- loss: 0.103984 [ 32/ 1200] loss: 0.011631 [ 160/ 1200] loss: 0.008946 [ 320/ 1200] loss: 0.026388 [ 480/ 1200] loss: 0.036525 [ 640/ 1200] loss: 0.017494 [ 800/ 1200] loss: 0.005056 [ 960/ 1200] loss: 0.015255 [ 1120/ 1200] loss: 0.003697 [ 1200/ 1200] Train loss: 0.056041, Accuracy: 98.4% Test Error: Avg loss: 0.064449, Accuracy: 97.7% Epochs without improvement: 1 Epoch 34 ------------------------------- loss: 0.015609 [ 32/ 1200] loss: 0.032148 [ 160/ 1200] loss: 0.007282 [ 320/ 1200] loss: 0.005343 [ 480/ 1200] loss: 0.118463 [ 640/ 1200] loss: 0.004808 [ 800/ 1200] loss: 0.027293 [ 960/ 1200] loss: 0.020904 [ 1120/ 1200] loss: 0.036531 [ 1200/ 1200] Train loss: 0.041759, Accuracy: 98.8% Test Error: Avg loss: 0.079946, Accuracy: 97.7% Epochs without improvement: 2 Epoch 35 ------------------------------- loss: 0.168372 [ 32/ 1200] loss: 0.021651 [ 160/ 1200] loss: 0.034423 [ 320/ 1200] loss: 0.016483 [ 480/ 1200] loss: 0.082766 [ 640/ 1200] loss: 0.060544 [ 800/ 1200] loss: 0.047634 [ 960/ 1200] loss: 0.064484 [ 1120/ 
1200] loss: 0.162962 [ 1200/ 1200] Train loss: 0.044576, Accuracy: 98.8% Test Error: Avg loss: 0.056666, Accuracy: 98.0% Epochs without improvement: 1 Epoch 36 ------------------------------- loss: 0.086165 [ 32/ 1200] loss: 0.048654 [ 160/ 1200] loss: 0.010367 [ 320/ 1200] loss: 0.021828 [ 480/ 1200] loss: 0.112453 [ 640/ 1200] loss: 0.113652 [ 800/ 1200] loss: 0.013788 [ 960/ 1200] loss: 0.018346 [ 1120/ 1200] loss: 0.002213 [ 1200/ 1200] Train loss: 0.043629, Accuracy: 98.8% Test Error: Avg loss: 0.053757, Accuracy: 98.7% Epochs without improvement: 1 Epoch 37 ------------------------------- loss: 0.071868 [ 32/ 1200] loss: 0.014995 [ 160/ 1200] loss: 0.081550 [ 320/ 1200] loss: 0.036406 [ 480/ 1200] loss: 0.016753 [ 640/ 1200] loss: 0.162450 [ 800/ 1200] loss: 0.106039 [ 960/ 1200] loss: 0.011425 [ 1120/ 1200] loss: 0.005053 [ 1200/ 1200] Train loss: 0.044404, Accuracy: 98.7% Test Error: Avg loss: 0.076989, Accuracy: 97.3% Epochs without improvement: 2 Epoch 38 ------------------------------- loss: 0.206524 [ 32/ 1200] loss: 0.001673 [ 160/ 1200] loss: 0.163483 [ 320/ 1200] loss: 0.089862 [ 480/ 1200] loss: 0.177988 [ 640/ 1200] loss: 0.046204 [ 800/ 1200] loss: 0.125248 [ 960/ 1200] loss: 0.005807 [ 1120/ 1200] loss: 0.268768 [ 1200/ 1200] Train loss: 0.055535, Accuracy: 98.0% Test Error: Avg loss: 0.061366, Accuracy: 97.3% Epochs without improvement: 3 Epoch 39 ------------------------------- loss: 0.003071 [ 32/ 1200] loss: 0.020929 [ 160/ 1200] loss: 0.034844 [ 320/ 1200] loss: 0.041652 [ 480/ 1200] loss: 0.020903 [ 640/ 1200] loss: 0.103893 [ 800/ 1200] loss: 0.007626 [ 960/ 1200] loss: 0.076357 [ 1120/ 1200] loss: 0.001390 [ 1200/ 1200] Train loss: 0.037344, Accuracy: 99.0% Test Error: Avg loss: 0.044911, Accuracy: 99.0% Epochs without improvement: 1 Epoch 40 ------------------------------- loss: 0.002676 [ 32/ 1200] loss: 0.066251 [ 160/ 1200] loss: 0.019777 [ 320/ 1200] loss: 0.020698 [ 480/ 1200] loss: 0.039842 [ 640/ 1200] loss: 0.065602 [ 800/ 1200] loss: 0.013986 [ 960/ 1200] loss: 0.037509 [ 1120/ 1200] loss: 0.001278 [ 1200/ 1200] Train loss: 0.040149, Accuracy: 98.7% Test Error: Avg loss: 0.054262, Accuracy: 98.0% Epochs without improvement: 2 Epoch 41 ------------------------------- loss: 0.002546 [ 32/ 1200] loss: 0.031648 [ 160/ 1200] loss: 0.021873 [ 320/ 1200] loss: 0.108578 [ 480/ 1200] loss: 0.004663 [ 640/ 1200] loss: 0.130175 [ 800/ 1200] loss: 0.033512 [ 960/ 1200] loss: 0.021728 [ 1120/ 1200] loss: 0.081582 [ 1200/ 1200] Train loss: 0.041842, Accuracy: 98.9% Test Error: Avg loss: 0.045009, Accuracy: 99.0% Epochs without improvement: 3 Epoch 42 ------------------------------- loss: 0.001665 [ 32/ 1200] loss: 0.015261 [ 160/ 1200] loss: 0.022316 [ 320/ 1200] loss: 0.033085 [ 480/ 1200] loss: 0.019073 [ 640/ 1200] loss: 0.054670 [ 800/ 1200] loss: 0.014871 [ 960/ 1200] loss: 0.005216 [ 1120/ 1200] loss: 0.035957 [ 1200/ 1200] Train loss: 0.039039, Accuracy: 98.6% Test Error: Avg loss: 0.042110, Accuracy: 98.3% Epochs without improvement: 1 Epoch 43 ------------------------------- loss: 0.017276 [ 32/ 1200] loss: 0.059321 [ 160/ 1200] loss: 0.002669 [ 320/ 1200] loss: 0.078319 [ 480/ 1200] loss: 0.104611 [ 640/ 1200] loss: 0.084913 [ 800/ 1200] loss: 0.014499 [ 960/ 1200] loss: 0.073456 [ 1120/ 1200] loss: 0.005112 [ 1200/ 1200] Train loss: 0.035320, Accuracy: 98.8% Test Error: Avg loss: 0.058085, Accuracy: 98.3% Epochs without improvement: 2 Epoch 44 ------------------------------- loss: 0.018703 [ 32/ 1200] loss: 0.102692 [ 160/ 1200] loss: 0.018073 [ 320/ 1200] 
loss: 0.011940 [ 480/ 1200] loss: 0.002803 [ 640/ 1200] loss: 0.019762 [ 800/ 1200] loss: 0.050111 [ 960/ 1200] loss: 0.035685 [ 1120/ 1200] loss: 0.047037 [ 1200/ 1200] Train loss: 0.044721, Accuracy: 98.7% Test Error: Avg loss: 0.047990, Accuracy: 98.7% Epochs without improvement: 3 Epoch 45 ------------------------------- loss: 0.021761 [ 32/ 1200] loss: 0.024765 [ 160/ 1200] loss: 0.126123 [ 320/ 1200] loss: 0.031635 [ 480/ 1200] loss: 0.019641 [ 640/ 1200] loss: 0.001573 [ 800/ 1200] loss: 0.003827 [ 960/ 1200] loss: 0.117278 [ 1120/ 1200] loss: 0.137728 [ 1200/ 1200] Train loss: 0.041787, Accuracy: 98.8% Test Error: Avg loss: 0.054772, Accuracy: 98.3% Epochs without improvement: 4 Epoch 46 ------------------------------- loss: 0.002151 [ 32/ 1200] loss: 0.006367 [ 160/ 1200] loss: 0.018258 [ 320/ 1200] loss: 0.058325 [ 480/ 1200] loss: 0.026320 [ 640/ 1200] loss: 0.038759 [ 800/ 1200] loss: 0.160399 [ 960/ 1200] loss: 0.002587 [ 1120/ 1200] loss: 0.005944 [ 1200/ 1200] Train loss: 0.030041, Accuracy: 99.2% Test Error: Avg loss: 0.058644, Accuracy: 97.7% Epochs without improvement: 5 Epoch 47 ------------------------------- loss: 0.039897 [ 32/ 1200] loss: 0.047978 [ 160/ 1200] loss: 0.008686 [ 320/ 1200] loss: 0.004509 [ 480/ 1200] loss: 0.005692 [ 640/ 1200] loss: 0.014400 [ 800/ 1200] loss: 0.033052 [ 960/ 1200] loss: 0.034523 [ 1120/ 1200] loss: 0.178733 [ 1200/ 1200] Train loss: 0.037962, Accuracy: 98.8% Test Error: Avg loss: 0.051239, Accuracy: 98.3% Epochs without improvement: 6 Epoch 48 ------------------------------- loss: 0.027276 [ 32/ 1200] loss: 0.043168 [ 160/ 1200] loss: 0.001433 [ 320/ 1200] loss: 0.024972 [ 480/ 1200] loss: 0.047565 [ 640/ 1200] loss: 0.003666 [ 800/ 1200] loss: 0.124968 [ 960/ 1200] loss: 0.059083 [ 1120/ 1200] loss: 0.250772 [ 1200/ 1200] Train loss: 0.047724, Accuracy: 98.6% Test Error: Avg loss: 0.071371, Accuracy: 97.3% Epochs without improvement: 7 Epoch 49 ------------------------------- loss: 0.167829 [ 32/ 1200] loss: 0.020035 [ 160/ 1200] loss: 0.128060 [ 320/ 1200] loss: 0.013263 [ 480/ 1200] loss: 0.079710 [ 640/ 1200] loss: 0.032987 [ 800/ 1200] loss: 0.004352 [ 960/ 1200] loss: 0.015335 [ 1120/ 1200] loss: 0.006766 [ 1200/ 1200] Train loss: 0.046962, Accuracy: 98.2% Test Error: Avg loss: 0.121998, Accuracy: 94.0% Epochs without improvement: 8 Epoch 50 ------------------------------- loss: 0.142457 [ 32/ 1200] loss: 0.025274 [ 160/ 1200] loss: 0.078302 [ 320/ 1200] loss: 0.126697 [ 480/ 1200] loss: 0.032256 [ 640/ 1200] loss: 0.036538 [ 800/ 1200] loss: 0.004753 [ 960/ 1200] loss: 0.003450 [ 1120/ 1200] loss: 0.035040 [ 1200/ 1200] Train loss: 0.051739, Accuracy: 98.1% Test Error: Avg loss: 0.041489, Accuracy: 99.0% Epochs without improvement: 1 Epoch 51 ------------------------------- loss: 0.111955 [ 32/ 1200] loss: 0.044337 [ 160/ 1200] loss: 0.023094 [ 320/ 1200] loss: 0.041527 [ 480/ 1200] loss: 0.061358 [ 640/ 1200] loss: 0.011867 [ 800/ 1200] loss: 0.001245 [ 960/ 1200] loss: 0.014972 [ 1120/ 1200] loss: 0.003035 [ 1200/ 1200] Train loss: 0.029646, Accuracy: 99.2% Test Error: Avg loss: 0.038868, Accuracy: 98.0% Epochs without improvement: 1 Epoch 52 ------------------------------- loss: 0.023232 [ 32/ 1200] loss: 0.001326 [ 160/ 1200] loss: 0.010747 [ 320/ 1200] loss: 0.011912 [ 480/ 1200] loss: 0.052980 [ 640/ 1200] loss: 0.065411 [ 800/ 1200] loss: 0.025955 [ 960/ 1200] loss: 0.001330 [ 1120/ 1200] loss: 0.051262 [ 1200/ 1200] Train loss: 0.030117, Accuracy: 99.2% Test Error: Avg loss: 0.041166, Accuracy: 98.0% Epochs without 
improvement: 2 Epoch 53 ------------------------------- loss: 0.051811 [ 32/ 1200] loss: 0.079863 [ 160/ 1200] loss: 0.099679 [ 320/ 1200] loss: 0.018016 [ 480/ 1200] loss: 0.105570 [ 640/ 1200] loss: 0.029669 [ 800/ 1200] loss: 0.091292 [ 960/ 1200] loss: 0.025268 [ 1120/ 1200] loss: 0.011846 [ 1200/ 1200] Train loss: 0.037327, Accuracy: 98.7% Test Error: Avg loss: 0.039990, Accuracy: 99.3% Epochs without improvement: 3 Epoch 54 ------------------------------- loss: 0.001528 [ 32/ 1200] loss: 0.002074 [ 160/ 1200] loss: 0.005081 [ 320/ 1200] loss: 0.051016 [ 480/ 1200] loss: 0.084545 [ 640/ 1200] loss: 0.027531 [ 800/ 1200] loss: 0.007653 [ 960/ 1200] loss: 0.030064 [ 1120/ 1200] loss: 0.018458 [ 1200/ 1200] Train loss: 0.035666, Accuracy: 98.9% Test Error: Avg loss: 0.053359, Accuracy: 97.7% Epochs without improvement: 4 Epoch 55 ------------------------------- loss: 0.086437 [ 32/ 1200] loss: 0.005623 [ 160/ 1200] loss: 0.060756 [ 320/ 1200] loss: 0.007930 [ 480/ 1200] loss: 0.001279 [ 640/ 1200] loss: 0.060474 [ 800/ 1200] loss: 0.004860 [ 960/ 1200] loss: 0.030017 [ 1120/ 1200] loss: 0.005950 [ 1200/ 1200] Train loss: 0.030254, Accuracy: 98.9% Test Error: Avg loss: 0.048697, Accuracy: 98.3% Epochs without improvement: 5 Epoch 56 ------------------------------- loss: 0.046880 [ 32/ 1200] loss: 0.006962 [ 160/ 1200] loss: 0.028615 [ 320/ 1200] loss: 0.032003 [ 480/ 1200] loss: 0.001176 [ 640/ 1200] loss: 0.004727 [ 800/ 1200] loss: 0.041096 [ 960/ 1200] loss: 0.003231 [ 1120/ 1200] loss: 0.001548 [ 1200/ 1200] Train loss: 0.025989, Accuracy: 99.2% Test Error: Avg loss: 0.039794, Accuracy: 98.3% Epochs without improvement: 6 Epoch 57 ------------------------------- loss: 0.027854 [ 32/ 1200] loss: 0.016113 [ 160/ 1200] loss: 0.004983 [ 320/ 1200] loss: 0.006536 [ 480/ 1200] loss: 0.005543 [ 640/ 1200] loss: 0.108778 [ 800/ 1200] loss: 0.072802 [ 960/ 1200] loss: 0.004151 [ 1120/ 1200] loss: 0.001613 [ 1200/ 1200] Train loss: 0.033686, Accuracy: 98.7% Test Error: Avg loss: 0.058687, Accuracy: 98.0% Epochs without improvement: 7 Epoch 58 ------------------------------- loss: 0.024724 [ 32/ 1200] loss: 0.004410 [ 160/ 1200] loss: 0.037158 [ 320/ 1200] loss: 0.114128 [ 480/ 1200] loss: 0.019958 [ 640/ 1200] loss: 0.007023 [ 800/ 1200] loss: 0.033113 [ 960/ 1200] loss: 0.066110 [ 1120/ 1200] loss: 0.012325 [ 1200/ 1200] Train loss: 0.032287, Accuracy: 98.8% Test Error: Avg loss: 0.070833, Accuracy: 97.0% Epochs without improvement: 8 Epoch 59 ------------------------------- loss: 0.004713 [ 32/ 1200] loss: 0.006559 [ 160/ 1200] loss: 0.108453 [ 320/ 1200] loss: 0.001160 [ 480/ 1200] loss: 0.094993 [ 640/ 1200] loss: 0.002221 [ 800/ 1200] loss: 0.005217 [ 960/ 1200] loss: 0.000800 [ 1120/ 1200] loss: 0.052071 [ 1200/ 1200] Train loss: 0.045157, Accuracy: 98.3% Test Error: Avg loss: 0.092795, Accuracy: 96.0% Epochs without improvement: 9 Epoch 60 ------------------------------- loss: 0.013907 [ 32/ 1200] loss: 0.077066 [ 160/ 1200] loss: 0.102597 [ 320/ 1200] loss: 0.044848 [ 480/ 1200] loss: 0.045275 [ 640/ 1200] loss: 0.045851 [ 800/ 1200] loss: 0.030483 [ 960/ 1200] loss: 0.008646 [ 1120/ 1200] loss: 0.000378 [ 1200/ 1200] Train loss: 0.029209, Accuracy: 99.1% Test Error: Avg loss: 0.038396, Accuracy: 99.0% Epochs without improvement: 1 Epoch 61 ------------------------------- loss: 0.005175 [ 32/ 1200] loss: 0.000340 [ 160/ 1200] loss: 0.033532 [ 320/ 1200] loss: 0.011765 [ 480/ 1200] loss: 0.024099 [ 640/ 1200] loss: 0.009380 [ 800/ 1200] loss: 0.035982 [ 960/ 1200] loss: 0.013744 [ 1120/ 
1200] loss: 0.068872 [ 1200/ 1200] Train loss: 0.028687, Accuracy: 98.9% Test Error: Avg loss: 0.039856, Accuracy: 99.0% Epochs without improvement: 2 Epoch 62 ------------------------------- loss: 0.021683 [ 32/ 1200] loss: 0.009609 [ 160/ 1200] loss: 0.034431 [ 320/ 1200] loss: 0.004877 [ 480/ 1200] loss: 0.003557 [ 640/ 1200] loss: 0.000897 [ 800/ 1200] loss: 0.085878 [ 960/ 1200] loss: 0.100189 [ 1120/ 1200] loss: 0.042491 [ 1200/ 1200] Train loss: 0.028550, Accuracy: 99.2% Test Error: Avg loss: 0.066530, Accuracy: 97.3% Epochs without improvement: 3 Epoch 63 ------------------------------- loss: 0.001216 [ 32/ 1200] loss: 0.061681 [ 160/ 1200] loss: 0.073598 [ 320/ 1200] loss: 0.055796 [ 480/ 1200] loss: 0.015083 [ 640/ 1200] loss: 0.098730 [ 800/ 1200] loss: 0.009850 [ 960/ 1200] loss: 0.002639 [ 1120/ 1200] loss: 0.000310 [ 1200/ 1200] Train loss: 0.038474, Accuracy: 98.5% Test Error: Avg loss: 0.032664, Accuracy: 99.0% Epochs without improvement: 1 Epoch 64 ------------------------------- loss: 0.099990 [ 32/ 1200] loss: 0.001848 [ 160/ 1200] loss: 0.017908 [ 320/ 1200] loss: 0.177410 [ 480/ 1200] loss: 0.002559 [ 640/ 1200] loss: 0.143031 [ 800/ 1200] loss: 0.129413 [ 960/ 1200] loss: 0.010796 [ 1120/ 1200] loss: 0.008519 [ 1200/ 1200] Train loss: 0.039543, Accuracy: 98.4% Test Error: Avg loss: 0.090810, Accuracy: 96.7% Epochs without improvement: 2 Epoch 65 ------------------------------- loss: 0.195597 [ 32/ 1200] loss: 0.010062 [ 160/ 1200] loss: 0.178570 [ 320/ 1200] loss: 0.027423 [ 480/ 1200] loss: 0.097403 [ 640/ 1200] loss: 0.037959 [ 800/ 1200] loss: 0.000849 [ 960/ 1200] loss: 0.073718 [ 1120/ 1200] loss: 0.000736 [ 1200/ 1200] Train loss: 0.031769, Accuracy: 99.2% Test Error: Avg loss: 0.042636, Accuracy: 98.3% Epochs without improvement: 3 Epoch 66 ------------------------------- loss: 0.027858 [ 32/ 1200] loss: 0.000856 [ 160/ 1200] loss: 0.000791 [ 320/ 1200] loss: 0.010801 [ 480/ 1200] loss: 0.002839 [ 640/ 1200] loss: 0.038085 [ 800/ 1200] loss: 0.079506 [ 960/ 1200] loss: 0.002746 [ 1120/ 1200] loss: 0.051221 [ 1200/ 1200] Train loss: 0.034148, Accuracy: 98.8% Test Error: Avg loss: 0.068834, Accuracy: 97.7% Epochs without improvement: 4 Epoch 67 ------------------------------- loss: 0.017481 [ 32/ 1200] loss: 0.002969 [ 160/ 1200] loss: 0.027781 [ 320/ 1200] loss: 0.004926 [ 480/ 1200] loss: 0.018719 [ 640/ 1200] loss: 0.003306 [ 800/ 1200] loss: 0.009174 [ 960/ 1200] loss: 0.036197 [ 1120/ 1200] loss: 0.001672 [ 1200/ 1200] Train loss: 0.026885, Accuracy: 99.1% Test Error: Avg loss: 0.038760, Accuracy: 99.0% Epochs without improvement: 5 Epoch 68 ------------------------------- loss: 0.004085 [ 32/ 1200] loss: 0.010015 [ 160/ 1200] loss: 0.001365 [ 320/ 1200] loss: 0.026532 [ 480/ 1200] loss: 0.129795 [ 640/ 1200] loss: 0.001214 [ 800/ 1200] loss: 0.012976 [ 960/ 1200] loss: 0.002772 [ 1120/ 1200] loss: 0.049484 [ 1200/ 1200] Train loss: 0.025784, Accuracy: 98.9% Test Error: Avg loss: 0.044448, Accuracy: 98.7% Epochs without improvement: 6 Epoch 69 ------------------------------- loss: 0.061515 [ 32/ 1200] loss: 0.008297 [ 160/ 1200] loss: 0.005122 [ 320/ 1200] loss: 0.030367 [ 480/ 1200] loss: 0.004813 [ 640/ 1200] loss: 0.001399 [ 800/ 1200] loss: 0.001552 [ 960/ 1200] loss: 0.058874 [ 1120/ 1200] loss: 0.010766 [ 1200/ 1200] Train loss: 0.031174, Accuracy: 98.8% Test Error: Avg loss: 0.061982, Accuracy: 97.0% Epochs without improvement: 7 Epoch 70 ------------------------------- loss: 0.024372 [ 32/ 1200] loss: 0.073436 [ 160/ 1200] loss: 0.022486 [ 320/ 1200] 
loss: 0.103297 [ 480/ 1200] loss: 0.007298 [ 640/ 1200] loss: 0.020005 [ 800/ 1200] loss: 0.005521 [ 960/ 1200] loss: 0.120259 [ 1120/ 1200] loss: 0.016034 [ 1200/ 1200] Train loss: 0.036680, Accuracy: 98.8% Test Error: Avg loss: 0.053770, Accuracy: 98.0% Epochs without improvement: 8 Epoch 71 ------------------------------- loss: 0.089758 [ 32/ 1200] loss: 0.013916 [ 160/ 1200] loss: 0.018616 [ 320/ 1200] loss: 0.024196 [ 480/ 1200] loss: 0.023773 [ 640/ 1200] loss: 0.032440 [ 800/ 1200] loss: 0.166427 [ 960/ 1200] loss: 0.000779 [ 1120/ 1200] loss: 0.001644 [ 1200/ 1200] Train loss: 0.037168, Accuracy: 98.8% Test Error: Avg loss: 0.044975, Accuracy: 98.3% Epochs without improvement: 9 Epoch 72 ------------------------------- loss: 0.002900 [ 32/ 1200] loss: 0.001028 [ 160/ 1200] loss: 0.023630 [ 320/ 1200] loss: 0.066303 [ 480/ 1200] loss: 0.002136 [ 640/ 1200] loss: 0.026897 [ 800/ 1200] loss: 0.001560 [ 960/ 1200] loss: 0.089822 [ 1120/ 1200] loss: 0.101744 [ 1200/ 1200] Train loss: 0.051376, Accuracy: 98.0% Test Error: Avg loss: 0.037899, Accuracy: 98.7% Epochs without improvement: 10 Epoch 73 ------------------------------- loss: 0.018171 [ 32/ 1200] loss: 0.100454 [ 160/ 1200] loss: 0.017810 [ 320/ 1200] loss: 0.038299 [ 480/ 1200] loss: 0.035341 [ 640/ 1200] loss: 0.020276 [ 800/ 1200] loss: 0.124847 [ 960/ 1200] loss: 0.002397 [ 1120/ 1200] loss: 0.015655 [ 1200/ 1200] Train loss: 0.042475, Accuracy: 98.6% Test Error: Avg loss: 0.082502, Accuracy: 96.3% Early stopping triggered after 74 epochs. Training completed.
res = 100
x,y=np.meshgrid(np.linspace(-1,1,res),np.linspace(-1,1,res))
xy=np.concatenate((x.reshape(-1,1),y.reshape(-1,1)),axis=1)
z=model(torch.tensor(xy).float()).detach().numpy()
z=np.argmax(z,1).reshape(res,res)
plt.contourf(x,y,z)
plt.scatter(training_data[:,0],training_data[:,1],c=training_labels)
- Describe your results and discuss the observed performance
The network reaches an accuracy of about 97-98% within the 100 epochs, and with the stopping parameters I have set it typically converges after 50-60 epochs. However, the last few percentage points are hard to gain because the data is tightly clustered around (0,0).
- Visualize network performance
The visualization shows the difficulty of classifying the data near the center; nevertheless, we still see good separation between the 5 classes.
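To see which classes the remaining errors come from, a simple confusion-matrix sketch over the test set (reusing the model, test_dataloader, number_classes, and device defined above) could be:
# Confusion matrix: rows are the true class, columns the predicted class
confusion = torch.zeros(number_classes, number_classes, dtype=torch.int64)
model.eval()
with torch.no_grad():
    for X, y in test_dataloader:
        X, y = X.to(device), y.to(device)
        preds = model(X).argmax(1)
        for t, p in zip(y, preds):
            confusion[t, p] += 1
print(confusion)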